Vision

The Vision module provides APIs for text recognition tasks.
It supports recognizing text from static images or by scanning documents using the camera.


Types

RecognizedText

Represents a block of recognized text.

  • content: string
    The recognized text content.

  • confidence: number
    Confidence level between 0.0 and 1.0, where 1.0 indicates the highest confidence.

  • boundingBox: { x: number, y: number, width: number, height: number }
    The bounding box of the recognized text in normalized coordinates.
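
Because boundingBox is normalized, it usually has to be scaled to the image size before it can be drawn or cropped. The following is a minimal sketch of that scaling, not part of the Vision module: toPixelRect and PixelRect are hypothetical names, and the code assumes the box is normalized against the full image width and height. This document does not state which corner is the origin, so no vertical flip is applied.

```typescript
// Local copy of the RecognizedText shape documented above.
interface RecognizedText {
  content: string
  confidence: number
  boundingBox: { x: number; y: number; width: number; height: number }
}

// Hypothetical result type, in pixel units.
interface PixelRect {
  x: number
  y: number
  width: number
  height: number
}

// Hypothetical helper: scales a normalized box (0..1) to pixel units.
// Assumption: values are fractions of the full image size. The origin corner
// is not specified here, so the y value is scaled but not flipped.
function toPixelRect(
  box: RecognizedText["boundingBox"],
  imageWidth: number,
  imageHeight: number
): PixelRect {
  return {
    x: box.x * imageWidth,
    y: box.y * imageHeight,
    width: box.width * imageWidth,
    height: box.height * imageHeight,
  }
}

const sampleBlock: RecognizedText = {
  content: "Hello",
  confidence: 0.5,
  boundingBox: { x: 0.25, y: 0.5, width: 0.5, height: 0.125 },
}

// For an 800 x 400 image this gives { x: 200, y: 200, width: 400, height: 50 }.
console.log(toPixelRect(sampleBlock.boundingBox, 800, 400))
```

If the underlying coordinate system turns out to have a bottom-left origin, the y value would need to be flipped (imageHeight minus y minus height) before drawing in a top-left system.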


RecognizeTextOptions

Configuration options for text recognition.

  • recognitionLevel?: "accurate" | "fast"
    Recognition mode:

    • "accurate" (default): Prioritizes accuracy.
    • "fast": Prioritizes speed.
  • recognitionLanguages?: string[]
    Preferred recognition languages as ISO language codes (for example "en" or "zh-Hans"), listed in priority order.

  • usesLanguageCorrection?: boolean
    Whether to apply automatic language correction during recognition.

  • minimumTextHeight?: number
    Minimum text height to recognize, relative to image height (default 0.03125).

  • customWords?: string[]
    Custom vocabulary to prioritize during word recognition. Only effective when usesLanguageCorrection is true.
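
Two of the options above interact or have a constrained range, so a small pre-flight check can catch mistakes before a recognition call. This is only a sketch: checkOptions is a hypothetical helper, not part of the module, and the (0, 1] range for minimumTextHeight is an assumption drawn from "relative to image height". The default of usesLanguageCorrection is not stated here, so only an explicit false is flagged.

```typescript
// Local copy of the RecognizeTextOptions shape documented above.
interface RecognizeTextOptions {
  recognitionLevel?: "accurate" | "fast"
  recognitionLanguages?: string[]
  usesLanguageCorrection?: boolean
  minimumTextHeight?: number
  customWords?: string[]
}

// Hypothetical helper: returns human-readable problems found in the options.
function checkOptions(options: RecognizeTextOptions): string[] {
  const problems: string[] = []

  // Assumption: the value is a fraction of the image height, so (0, 1].
  const height = options.minimumTextHeight
  if (height !== undefined && (height <= 0 || height > 1)) {
    problems.push("minimumTextHeight should be a fraction in (0, 1]")
  }

  // customWords is documented as effective only with language correction on.
  const hasCustomWords = (options.customWords?.length ?? 0) > 0
  if (hasCustomWords && options.usesLanguageCorrection === false) {
    problems.push("customWords has no effect when usesLanguageCorrection is false")
  }

  return problems
}

// One problem is reported: custom words with language correction disabled.
console.log(checkOptions({ customWords: ["Scripting"], usesLanguageCorrection: false }))
```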


Functions

recognizeText(image: UIImage, options?: RecognizeTextOptions): Promise<{ text: string, candidates: RecognizedText[] }>

Recognizes text from the provided image.

  • Parameters:

    • image: A UIImage object.
    • options (optional): Recognition options.
  • Returns:
    A Promise resolving with:

    • text: All recognized text combined into a single string.
    • candidates: Array of recognized text blocks with details.
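
The candidates array carries a confidence value per block, which makes it possible to discard uncertain results instead of using the combined text as-is. The sketch below shows one way to do that. confidentText is a hypothetical helper, and joining blocks with a newline is an assumption, since this document does not say how the module itself builds the combined text.

```typescript
// Local copy of the RecognizedText shape documented above.
interface RecognizedText {
  content: string
  confidence: number
  boundingBox: { x: number; y: number; width: number; height: number }
}

// Hypothetical helper: keeps blocks at or above minConfidence and joins
// their content with newlines (the separator is an assumption).
function confidentText(candidates: RecognizedText[], minConfidence: number): string {
  return candidates
    .filter((block) => block.confidence >= minConfidence)
    .map((block) => block.content)
    .join("\n")
}

const box = { x: 0, y: 0, width: 1, height: 0.25 }
const sampleCandidates: RecognizedText[] = [
  { content: "Invoice", confidence: 0.9, boundingBox: box },
  { content: "l1l|", confidence: 0.2, boundingBox: box },
  { content: "Total", confidence: 0.5, boundingBox: box },
]

// Keeps "Invoice" and "Total"; the 0.2 block falls below the threshold.
console.log(confidentText(sampleCandidates, 0.5))
```

In an app, the input would be result.candidates from Vision.recognizeText rather than the hand-written sample above.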

scanDocument(options?: RecognizeTextOptions): Promise<string[]>

Scans a document using the device's camera and recognizes text.

  • Parameters:

    • options (optional): Recognition options.
  • Returns:
    A Promise resolving with an array of strings containing the recognized text.
    If the user cancels, the Promise rejects with an error.
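
Since cancellation arrives as a rejection, callers that treat "no scan" as a normal outcome may prefer a wrapper that returns null instead of throwing. The following is a rough sketch under that assumption. scanOrNull is a hypothetical helper that takes the scan call as a parameter, so it can be exercised outside the app. It does not distinguish cancellation from other failures, because this document does not describe the shape of the error.

```typescript
// Hypothetical wrapper: runs the given scan function and maps any rejection
// (including user cancellation) to null.
async function scanOrNull(
  scan: () => Promise<string[]>
): Promise<string[] | null> {
  try {
    return await scan()
  } catch {
    // Cancelled or failed; the two cases are not told apart here.
    return null
  }
}

// Inside the app this would wrap the real call, for example:
//   const texts = await scanOrNull(() => Vision.scanDocument({ recognitionLevel: "fast" }))
// Stand-in scan functions are used below so the sketch runs on its own.
scanOrNull(async () => ["first scan"]).then((texts) => {
  console.log("Completed:", texts)
})
scanOrNull(async () => {
  throw new Error("cancelled")
}).then((texts) => {
  console.log("Cancelled or failed:", texts)
})
```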


Usage Examples

Recognize text from an image file

const image = UIImage.fromFile('/path/to/image.png')
// Continue only if the image was loaded
if (image) {
  const result = await Vision.recognizeText(image, {
    recognitionLevel: 'accurate',
    recognitionLanguages: ['en', 'zh-Hans'],
    usesLanguageCorrection: true
  })
  // All recognized text combined into a single string
  console.log('Recognized Text:', result.text)

  // Per-block details
  for (const block of result.candidates) {
    console.log(`Text: ${block.content}, Confidence: ${block.confidence}`)
  }
}

Scan a document with camera

try {
  const documents = await Vision.scanDocument({
    recognitionLevel: 'fast',
    recognitionLanguages: ['en']
  })
  console.log('Scanned Documents:', documents)
} catch (error) {
  console.error('Scan cancelled or failed:', error)
}